17 research outputs found

    An accurate retrieval through R-MAC+ descriptors for landmark recognition

    Full text link
    The landmark recognition problem is far from being solved, but with the use of features extracted from intermediate layers of Convolutional Neural Networks (CNNs), excellent results have been obtained. In this work, we propose some improvements on the creation of R-MAC descriptors in order to make the newly-proposed R-MAC+ descriptors more representative than the previous ones. However, the main contribution of this paper is a novel retrieval technique, that exploits the fine representativeness of the MAC descriptors of the database images. Using this descriptors called "db regions" during the retrieval stage, the performance is greatly improved. The proposed method is tested on different public datasets: Oxford5k, Paris6k and Holidays. It outperforms the state-of-the- art results on Holidays and reached excellent results on Oxford5k and Paris6k, overcame only by approaches based on fine-tuning strategies

    Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition

    Full text link
    The problem of landmark recognition has achieved excellent results in small-scale datasets. When dealing with large-scale retrieval, issues that were irrelevant with small amount of data, quickly become fundamental for an efficient retrieval phase. In particular, computational time needs to be kept as low as possible, whilst the retrieval accuracy has to be preserved as much as possible. In this paper we propose a novel multi-index hashing method called Bag of Indexes (BoI) for Approximate Nearest Neighbors (ANN) search. It allows to drastically reduce the query time and outperforms the accuracy results compared to the state-of-the-art methods for large-scale landmark recognition. It has been demonstrated that this family of algorithms can be applied on different embedding techniques like VLAD and R-MAC obtaining excellent results in very short times on different public datasets: Holidays+Flickr1M, Oxford105k and Paris106k

    A location-aware embedding technique for accurate landmark recognition

    Full text link
    The current state of the research in landmark recognition highlights the good accuracy which can be achieved by embedding techniques, such as Fisher vector and VLAD. All these techniques do not exploit spatial information, i.e. consider all the features and the corresponding descriptors without embedding their location in the image. This paper presents a new variant of the well-known VLAD (Vector of Locally Aggregated Descriptors) embedding technique which accounts, at a certain degree, for the location of features. The driving motivation comes from the observation that, usually, the most interesting part of an image (e.g., the landmark to be recognized) is almost at the center of the image, while the features at the borders are irrelevant features which do no depend on the landmark. The proposed variant, called locVLAD (location-aware VLAD), computes the mean of the two global descriptors: the VLAD executed on the entire original image, and the one computed on a cropped image which removes a certain percentage of the image borders. This simple variant shows an accuracy greater than the existing state-of-the-art approach. Experiments are conducted on two public datasets (ZuBuD and Holidays) which are used both for training and testing. Morever a more balanced version of ZuBuD is proposed.Comment: 6 pages, 5 figures, ICDSC 201

    Content-based image retrieval for visual big data analysis

    No full text
    The Content-Based Image Retrieval (CBIR) task is a computer vision problem. The growth of the digital images on the Internet allows to encourage the proposal of solution for this task more than before. The access to this huge quantity of data has allowed the creation of big datasets, that brought with them lots of new challenges. Briefly, the objective of the task is simply to retrieve and rank the similar images to the query one, called retrieval accuracy, that need to be as high as possible. Moreover, there are secondary targets as retrieval time and memory occupancy that need to be as low as possible. The problem is trivial for humans that simply execute this task through experience and semantic perception, but it is not so easy for a computer. This is known as semantic gap, which refers to the gap between low-level image pixels and high-level semantic concepts. Furthermore, the images may contain noisy patches (e.g. trees, person, cars, ...), be taken with different lightning conditions, viewpoints and resolution. In order to solve this problem it is crucial to develop algorithms and techniques with the objective of reducing the weight of the unnecessary patches of the images and that work well with a vast quantity of data. There are several applications of CBIR systems: libraries and museum applications, fashion application for the search of certain clothes, advanced electronic tourist guides. In this thesis a complete pipeline for the resolution of the CBIR problem is presented and then all the steps of the process are evaluated with a particular focus on CNN transfer learning, embeddings, large-scale retrieval and methods based on graphs as diffusion mechanism. All the methods presented are tested on several public image datasets in order to compare the final retrieval results

    An accurate retrieval through R-MAC+ descriptors for landmark recognition

    No full text
    The landmark recognition problem is far from being solved, but with the use of features extracted from intermediate layers of Convolutional Neural Networks (CNNs), excellent results have been obtained. In this work, we propose some improvements on the creation of R-MAC descriptors in order to make the newly-proposed R-MAC+ descriptors more representative than the previous ones. However, the main contribution of this paper is a novel retrieval technique, that exploits the fine representativeness of the MAC descriptors of the database images. Using this descriptors called "db regions" during the retrieval stage, the performance is greatly improved. The proposed method is tested on different public datasets: Oxford5k, Paris6k and Holidays. It outperforms the state-of-theart results on Holidays and reached excellent results on Oxford5k and Paris6k, overcame only by approaches based on fine-tuning strategies

    Landmark Recognition: From Small-Scale to Large-Scale Retrieval

    No full text
    During the last years, the problem of landmark recognition is addressed in many different ways. Landmark recognition is related to finding the most similar images to a starting one in a particular dataset of buildings or places. This chapter explains the most used techniques for solving the problem of landmark recognition, with a specific focus on techniques based on deep learning. Firstly, the focus is on the classical approaches for the creation of descriptors used in the content-based image retrieval task. Secondly, the deep learning approach that has shown overwhelming improvements in many tasks of computer vision, is presented. A particular attention is put on the major recent breakthroughs in Content-Based Image Retrieval (CBIR), the first one is transfer learning which improves the feature representation and therefore accuracy of the retrieval system. The second one is the fine-tuning technique, that allows to highly improve the performance of the retrieval system, is presented. Finally, the chapter exposes the techniques for large-scale retrieval, in which datasets contain at least a million images

    Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition

    No full text
    The problem of landmark recognition has achieved excellent results in small-scale datasets. Instead, when dealing with large-scale retrieval, issues that were irrelevant with small amount of data, quickly become fundamental for an efficient retrieval phase. In particular, computational time needs to be kept as low as possible, whilst the retrieval accuracy has to be preserved as much as possible. In this paper we propose a novel multi-index hashing method called Bag of Indexes (BoI) for Approximate Nearest Neighbors (ANN) search. It allows to drastically reduce the query time and outperforms the accuracy results compared to the state-of-the-art methods for large-scale landmark recognition. It has been demonstrated that this family of algorithms can be applied on different embedding techniques like VLAD and R-MAC obtaining excellent results in very short times on different public datasets: Holidays+Flickr1M, Oxford105k and Paris106k

    Diffusion Parameters Analysis in a Content-Based Image Retrieval Task for Mobile Vision

    Get PDF
    Most recent computer vision tasks take into account the distribution of image features to obtain more powerful models and better performance. One of the most commonly used techniques to this purpose is the diffusion algorithm, which fuses manifold data and k-Nearest Neighbors (kNN) graphs. In this paper, we describe how we optimized diffusion in an image retrieval task aimed at mobile vision applications, in order to obtain a good trade-off between computation load and performance. From a computational efficiency viewpoint, the high complexity of the exhaustive creation of a full kNN graph for a large database renders such a process unfeasible on mobile devices. From a retrieval performance viewpoint, the diffusion parameters are strongly task-dependent and affect significantly the algorithm performance. In the method we describe herein, we tackle the first issue by using approximate algorithms in building the kNN tree. The main contribution of this work is the optimization of diffusion parameters using a genetic algorithm (GA), which allows us to guarantee high retrieval performance in spite of such a simplification. The results we have obtained confirm that the global search for the optimal diffusion parameters performed by a genetic algorithm is equivalent to a massive analysis of the diffusion parameter space for which an exhaustive search would be totally unfeasible. We show that even a grid search could often be less efficient (and effective) than the GA, i.e., that the genetic algorithm most often produces better diffusion settings when equal computing resources are available to the two approaches. Our method has been tested on several publicly-available datasets: Oxford5k, ROxford5k, Paris6k, RParis6k, and Oxford105k, and compared to other mainstream approaches

    An Efficient Approximate kNN Graph Method for Diffusion on Image Retrieval

    No full text
    The application of the diffusion in many computer vision and artificial intelligence projects has been shown to give excellent improvements in performance. One of the main bottlenecks of this technique is the quadratic growth of the kNN graph size due to the high-quantity of new connections between nodes in the graph, resulting in long computation times. Several strategies have been proposed to address this, but none are effective and efficient. Our novel technique, based on LSH projections, obtains the same performance as the exact kNN graph after diffusion, but in less time (approximately 18 times faster on a dataset of a hundred thousand images). The proposed method was validated and compared with other state-of-the-art on several public image datasets, including Oxford5k, Paris6k, and Oxford105k
    corecore